Big Data Ethics

Big data ethics, also known simply as data ethics, refers to systemizing, defending, and recommending concepts of right and wrong conduct in relation to data, in particular personal data. Since the dawn of the Internet, the sheer quantity and quality of data has increased dramatically and continues to do so exponentially. Big data describes data so voluminous and complex that traditional data-processing application software is inadequate to deal with it. Recent innovations in medical research and healthcare, such as high-throughput genome sequencing, high-resolution imaging, electronic medical patient records, and a plethora of internet-connected health devices, have triggered a data deluge that will reach the exabyte range in the near future. Data ethics becomes more relevant as the quantity of data grows, because the scale of its impact grows with it. Big data ethics differs from information ethics: information ethics focuses on issues of intellectual property and on concerns relating to librarians, archivists, and information professionals, while big data ethics is more concerned with collectors and disseminators of structured or unstructured data, such as data brokers, governments, and large corporations.


Principles

Data ethics is concerned with the following principles:
* ''Ownership'' - Individuals own their own data.
* ''Transaction transparency'' - If an individual's personal data is used, they should have transparent access to the algorithm design used to generate aggregate data sets.
* ''Consent'' - If an individual or legal entity would like to use personal data, it needs the informed and explicitly expressed consent of the owner of the data as to what personal data moves to whom, when, and for what purpose.
* ''Privacy'' - If data transactions occur, all reasonable effort needs to be made to preserve privacy.
* ''Currency'' - Individuals should be aware of financial transactions resulting from the use of their personal data and of the scale of these transactions.
* ''Openness'' - Aggregate data sets should be freely available.
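The consent principle names four things a grant has to pin down: what data moves, to whom, when, and for what purpose. One way to picture this is as a small record that a request is checked against. The sketch below is illustrative only; `ConsentRecord`, `is_permitted`, and all field names and values are hypothetical and do not come from any law or library.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ConsentRecord:
    """Hypothetical record of one explicit consent grant."""
    owner: str             # the individual the data is about
    recipient: str         # to whom the data may move
    data_items: frozenset  # what personal data may move
    purpose: str           # for what purpose it may be used
    valid_from: date       # when the grant starts
    valid_until: date      # when the grant ends


def is_permitted(record, recipient, item, purpose, on):
    """Return True only if every element of the request matches the grant."""
    return (
        recipient == record.recipient
        and item in record.data_items
        and purpose == record.purpose
        and record.valid_from <= on <= record.valid_until
    )


# Made-up example grant: one data item, one recipient, one purpose, one year.
consent = ConsentRecord(
    owner="alice",
    recipient="clinic-a",
    data_items=frozenset({"blood_pressure"}),
    purpose="treatment",
    valid_from=date(2024, 1, 1),
    valid_until=date(2024, 12, 31),
)

treatment_ok = is_permitted(
    consent, "clinic-a", "blood_pressure", "treatment", date(2024, 6, 1)
)
marketing_ok = is_permitted(
    consent, "clinic-a", "blood_pressure", "marketing", date(2024, 6, 1)
)
```

In this sketch a request fails as soon as any one element differs from the grant, so the same recipient asking for the same data for a different purpose is refused.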


Ownership

Ownership of data involves determining rights and duties over property. The concept of data ownership is linked to the ability to exercise control over, and limit the sharing of, personal data. The question of ownership arises when one person records their observations of another person: the observer and the observed both have a claim. Questions also arise about the responsibilities that the observer and the observed have towards each other. With the massive scale and systematisation of observation of people and their thoughts that the Internet has brought, these questions are increasingly important to address. The question of personal data ownership falls into unknown territory between corporate ownership, intellectual property, and slavery. A related question is who owns a digital identity. European law, notably the General Data Protection Regulation, indicates that individuals own their own personal data.


Transaction transparency

Concerns have been raised about how biases can be built into algorithm design, resulting in systematic oppression. In terms of governance, big data ethics is concerned with which types of inferences and predictions should be made using big data technologies such as algorithms. Anticipatory governance is the practice of using predictive analytics to assess possible future behaviours. This has ethical implications because it makes it possible to target particular groups and places, which can encourage prejudice and discrimination. For example, predictive policing highlights certain groups or neighbourhoods to be watched more closely than others, which leads to more sanctions in these areas and closer surveillance of those who fit the same profiles as those already sanctioned. The term "control creep" refers to data that was generated for one purpose being repurposed for another. This practice is seen with airline industry data, which has been repurposed for profiling and managing security risks at airports.
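The feedback loop described for predictive policing can be made concrete with a toy model. The sketch below is not drawn from any real policing system: the function `simulate`, the allocation rule (the area with the most recorded sanctions receives a fixed `focus` share of patrols), and every number are assumptions chosen only for illustration.

```python
def simulate(recorded, true_rate, rounds, patrols=100, focus=0.7):
    """Toy model: patrols follow past records, and records follow patrols.

    recorded  -- initial recorded sanctions per area (two or more areas)
    true_rate -- underlying offence rate, assumed identical in every area
    focus     -- assumed share of patrols sent to the top-ranked area
    """
    recorded = list(recorded)
    others = len(recorded) - 1
    for _ in range(rounds):
        # The area with the most recorded sanctions is watched most closely.
        top = recorded.index(max(recorded))
        for area in range(len(recorded)):
            share = focus if area == top else (1 - focus) / others
            # More patrols in an area means more sanctions recorded there,
            # even though the underlying rate is the same everywhere.
            recorded[area] += patrols * share * true_rate
    return recorded


start = [11, 10]  # made-up figures; area 0 starts one record ahead
after = simulate(start, true_rate=0.1, rounds=5)
gap_before = start[0] - start[1]
gap_after = after[0] - after[1]
```

Under these assumptions both areas have the same underlying rate, yet the gap in recorded sanctions widens each round because the records themselves steer where observation happens.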


Privacy

Privacy has been presented as a limitation on data usage, a limitation which could itself be considered unethical. For example, the sharing of healthcare data can shed light on the causes of diseases and the effects of treatments, and can allow for analyses tailored to individuals' needs. This is ethically significant in the big data ethics field because, while many value privacy, the affordances of data sharing are also valuable, even though they may conflict with one's conception of privacy. Attitudes against data sharing may be based on a perceived loss of control over data and a fear of the exploitation of personal data. However, it is possible to extract the value of data without compromising privacy. Some scholars, such as Jonathan H. King and Neil M. Richards, are redefining the traditional meaning of privacy, while others question whether privacy still exists. In a 2014 article for the ''Wake Forest Law Review'', King and Richards argue that privacy in the digital age can be understood not in terms of secrecy but in terms of the regulations that govern and control the use of personal information. In the European Union, the right to be forgotten entitles EU countries to force the removal or de-linking of personal data from databases at an individual's request if the information is deemed irrelevant or out of date. According to Andrew Hoskins, this law demonstrates the moral panic of EU members over the perceived loss of privacy and of the ability to govern personal data in the digital age. In the United States, citizens have the right to delete voluntarily submitted data. This is very different from the right to be forgotten, because much of the data produced using big data technologies and platforms is not voluntarily submitted.


How much data is worth

One rudimentary way to estimate what personal data is worth is to compare the value of the services that tech companies provide with the equity value of those companies: the gap reflects the difference between the exchange rate offered to the citizen and the "market rate" of their data. Scientifically, this calculation has many holes: the financial figures of tax-evading companies are unreliable; it is unclear whether revenue or profit is the more appropriate measure; it is unclear how a user should be defined; a large number of individuals is needed for the data to be valuable; prices may be tiered for different people in different countries; and so on. Although these calculations are crude, they make the monetary value of data more tangible. Another approach is to look at the rates at which data trades on the black market; RSA publishes a yearly cybersecurity shopping list that takes this approach. This raises the economic question of whether free tech services in exchange for personal data are a worthwhile implicit exchange for the consumer. In the personal data trading model, rather than companies selling data, owners can sell their own personal data and keep the profit.
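A back-of-the-envelope version of such a calculation divides a company's financial figure by its number of users. The sketch below assumes that reading of the calculation; the function `value_per_user` and every figure in it are hypothetical placeholders, not data about any real company.

```python
def value_per_user(financial_figure, active_users):
    """Crude per-user value: one financial figure spread over all users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return financial_figure / active_users


# Entirely made-up figures for an imaginary company (currency units per year).
company = {
    "revenue": 40_000_000_000,
    "profit": 10_000_000_000,
    "users": 2_000_000_000,
}

by_revenue = value_per_user(company["revenue"], company["users"])
by_profit = value_per_user(company["profit"], company["users"])
```

With these placeholder numbers the estimate is 20.0 per user when based on revenue and 5.0 when based on profit, which shows how strongly the result depends on the choice of figure, one of the weaknesses noted above. Changing how a "user" is counted shifts the result in the same way.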


Openness

The idea of open data is centred on the argument that data should be freely available and should not be subject to restrictions, such as copyright laws, that would prohibit its use. Many governments have begun to move towards publishing open datasets for the purpose of transparency and accountability. This movement has gained traction through "open data activists", who have called for governments to make datasets available so that citizens can extract meaning from the data and perform checks and balances themselves. King and Richards have argued that this call for transparency includes a tension between openness and secrecy. Activists and scholars have also argued that, because this open-sourced model of data evaluation is based on voluntary participation, the availability of open datasets has a democratizing effect on a society, allowing any citizen to participate. To some, the availability of certain types of data is seen as a right and an essential part of a citizen's agency. The Open Knowledge Foundation (OKF) lists several dataset types that governments should provide in order to be truly open. The OKF has a tool called the Global Open Data Index (GODI), a crowd-sourced survey for measuring the openness of governments according to the Open Definition. The aim of the GODI is to give governments feedback on the quality of their open datasets. Willingness to share data varies from person to person. Preliminary studies have been conducted into the determinants of the willingness to share data. For example, some have suggested that baby boomers are less willing to share data than millennials.


The role of institutions


Nation states

Data sovereignty refers to a government's control over the data that is generated and collected within a country. The issue of data sovereignty gained prominence when Edward Snowden leaked US government information about a number of governments and individuals on whom the US government was spying. This prompted many governments to reconsider their approach to data sovereignty and the security of their citizens' data. J. De Jong-Chen points out that restricting data flows can hinder scientific discovery, to the disadvantage of many, but particularly of developing countries. This is of considerable concern to big data ethics because of the tension between the two important issues of cybersecurity and global development.


See also

* Dynamic consent


References

* Hoskins, A. (November 4, 2014). "Digital Memory Studies". www.memorystudies-frankfurt.com. Retrieved 2017-11-28.
* Kitchin, R. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences (pp. 165–183). ''SAGE Publications''. Kindle Edition.